A computational approach to distinguish somatic vs. germline origin of genomic alterations from deep sequencing of cancer specimens without a matched normal

Authors

  • James X Sun
  • Yuting He
  • Eric Sanford
  • Meagan Montesion
  • Garrett M Frampton
  • Stéphane Vignot
  • Jean-Charles Soria
  • Jeffrey S Ross
  • Vincent A Miller
  • Phil J Stephens
  • Doron Lipson
  • Roman Yelensky
Abstract

A key constraint in genomic testing in oncology is that matched normal specimens are not commonly obtained in clinical practice. Thus, while well-characterized genomic alterations do not require normal tissue for interpretation, for a significant number of alterations it remains unknown whether they are germline or somatic in the absence of a matched normal control. We introduce SGZ (somatic-germline-zygosity), a computational method for predicting the somatic vs. germline origin and the homozygous vs. heterozygous or sub-clonal state of variants identified by deep massively parallel sequencing (MPS) of cancer specimens. The method does not require a patient-matched normal control, enabling broad application in clinical research. SGZ predicts the somatic vs. germline status of each identified alteration by modeling the alteration's allele frequency (AF), taking into account the tumor content, tumor ploidy, and local copy number. The accuracy of the prediction depends on the depth of sequencing and the fit of the copy number model, both of which are achieved in our clinical assay by sequencing to high depth (>500x) using MPS, covering 394 cancer-related genes and over 3,500 genome-wide single nucleotide polymorphisms (SNPs). Calls are made using a statistic based on read depth and local variability of SNP AF. To validate the method, we first evaluated performance on samples from 30 lung and colon cancer patients, for whom we sequenced both tumors and matched normal tissue. We then examined predictions for 17 somatic hotspot mutations and 20 common germline SNPs in 20,182 clinical cancer specimens. To assess the impact of stromal admixture, we examined three cell lines, each titrated with its matched normal to six levels (10-75%). Overall, predictions were made in 85% of cases, with 95-99% of variants predicted correctly, a significantly superior performance compared to a basic approach based on AF alone. Finally, we applied the SGZ method to the COSMIC database of known somatic variants in cancer and found >50 that are in fact more likely to be germline.
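
The abstract does not give the AF model itself, so the following is only a minimal sketch of the general idea, assuming a common two-population mixture (tumor cells at purity `p` with local copy number `C` and `M` copies of the variant allele, plus diploid normal cells). Under that assumption the expected AF is `p*M / (p*C + 2*(1-p))` for a somatic variant and `(p*M + (1-p)) / (p*C + 2*(1-p))` for a heterozygous germline variant. The function names, the binomial comparison, and the `min_log_odds` threshold are hypothetical illustrations; the SGZ statistic based on read depth and local variability of SNP AF is not reproduced here.

```python
import math


def expected_af(purity, copy_number, mutant_copies, germline):
    """Expected allele frequency under an assumed tumor + diploid-normal mixture."""
    if not 0.0 <= purity <= 1.0:
        raise ValueError("purity must be in [0, 1]")
    if not 0 <= mutant_copies <= copy_number:
        raise ValueError("mutant_copies must be in [0, copy_number]")
    total_copies = purity * copy_number + 2.0 * (1.0 - purity)
    if total_copies == 0.0:
        raise ValueError("no DNA at this locus (pure tumor, zero copies)")
    # A heterozygous germline variant contributes one copy from each normal cell;
    # a somatic variant contributes none.
    normal_alt = 1.0 if germline else 0.0
    return (purity * mutant_copies + (1.0 - purity) * normal_alt) / total_copies


def binom_loglik(alt_reads, depth, p):
    """Binomial log-likelihood without the constant combinatorial term."""
    p = min(max(p, 1e-6), 1.0 - 1e-6)  # clamp to avoid log(0)
    return alt_reads * math.log(p) + (depth - alt_reads) * math.log(1.0 - p)


def classify(alt_reads, depth, purity, copy_number, min_log_odds=3.0):
    """Return 'somatic', 'germline', or 'ambiguous' (hypothetical decision rule)."""
    if copy_number < 1:
        raise ValueError("copy_number must be at least 1")
    # Somatic variants need at least one mutant copy in the tumor; a germline
    # variant may have lost its alternate allele in the tumor (M = 0).
    somatic = max(
        binom_loglik(alt_reads, depth, expected_af(purity, copy_number, m, False))
        for m in range(1, copy_number + 1)
    )
    germline = max(
        binom_loglik(alt_reads, depth, expected_af(purity, copy_number, m, True))
        for m in range(0, copy_number + 1)
    )
    if abs(somatic - germline) < min_log_odds:
        return "ambiguous"
    return "somatic" if somatic > germline else "germline"
```

With a purity of 0.4 and a local copy number of 2, for instance, this sketch separates a variant seen in 100 of 500 reads from one seen in 250 of 500 reads, while a variant seen in 7 of 20 reads stays ambiguous, which illustrates why sequencing depth matters for such calls.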

Related resources

Discriminating somatic and germline mutations in tumor DNA samples without matching normals.

Tumor analyses commonly employ a correction with a matched normal (MN), a sample from healthy tissue of the same individual, in order to distinguish germline mutations from somatic mutations. Since the majority of variants found in an individual are thought to be common within the population, we constructed a set of 931 samples from healthy, unrelated individuals, originating from two different...

Identification of high-confidence somatic mutations in whole genome sequence of formalin-fixed breast cancer specimens

The utilization of archived, formalin-fixed paraffin-embedded (FFPE) tumor samples for massive parallel sequencing has been challenging due to DNA damage and contamination with normal stroma. Here, we perform whole genome sequencing of DNA isolated from two triple-negative breast cancer tumors archived for >11 years as 5 µm FFPE sections and matched germline DNA. The tumor samples show differin...

multiSNV: a probabilistic approach for improving detection of somatic point mutations from multiple related tumour samples

Somatic variant analysis of a tumour sample and its matched normal has been widely used in cancer research to distinguish germline polymorphisms from somatic mutations. However, due to the extensive intratumour heterogeneity of cancer, sequencing data from a single tumour sample may greatly underestimate the overall mutational landscape. In recent studies, multiple spatially or temporally separ...

Toward Concurrent Testing for Somatic and Germline Variants in Cancer Patients.

Sequencing for somatic alterations in patients' tumors is being increasingly clinically implemented to detect mutations that may guide therapy. Germline analysis of a cohort of patients undergoing tumor sequencing with matched normal has revealed that a small but significant percentage of these patients have germline variants that confer cancer susceptibility. Clin Cancer Res; 22(16); 3987-8. ©...

FocalCall: An R Package for the Annotation of Focal Copy Number Aberrations

In order to identify somatic focal copy number aberrations (CNAs) in cancer specimens and to distinguish them from germ-line copy number variations (CNVs), we developed the software package FocalCall. FocalCall enables user-defined size cutoffs to recognize focal aberrations and builds on established array comparative genomic hybridization segmentation and calling algorithms. To distinguish CNA...

Journal title:

Volume 14, Issue

Pages -

Publication date: 2018